Abstract. A triangle-free graph $G$ is called read-$k$ when there
exists a monotone Boolean formula $\phi$ whose variables are the
vertices of $G$ and whose minterms are precisely the edges of $G$, such
that no variable occurs more than $k$ times in $\phi$. The smallest such
$k$ is called the readability of $G$. We exhibit a very simple class of
bipartite graphs with unbounded readability.



1.1. Terminology. We consider monotone Boolean formulas (formulas for
short), i.e., formulas built from variables $a_1, \ldots, a_n$ using
the Boolean operations $\vee$ and $\wedge$, which we denote as $+$ and $*$ for
convenience. If no variable appears more than $k$ times in a formula $\phi$,
we say that $\phi$ is read-$k$. A monotone Boolean function $F$ is said to be
read-$k$ if $F$ has a logically equivalent read-$k$ formula. The readability
of a monotone Boolean function $F$ is the smallest $k$ such that $F$ is
read-$k$. In general, determining the readability of a monotone Boolean
function might be quite difficult: to the best of our knowledge it is not
known whether there is a polynomial-time algorithm which, given a monotone
Boolean function $F$ in an irredundant DNF or CNF representation,
decides whether or not $F$ has a read-$k$ formula, for fixed $k \ge 2$.

Given a formula $\phi$, we can, using distributivity and idempotency,
write a formula logically equivalent to $\phi$ in the form of a sum of products
of distinct variables, which we call the complete sum of products of $\phi$,
denoted by CSOP($\phi$). Using the absorption rule $\alpha + \alpha * \beta = \alpha$ we can
simplify CSOP($\phi$) by eliminating products containing other products,
obtaining the sum of minterms of $\phi$, denoted by SOP($\phi$). Each formula
$\phi'$ logically equivalent to $\phi$ satisfies SOP($\phi'$) = SOP($\phi$), so we denote it
by SOP($F$), where $F$ is the Boolean function given by $\phi$. For example,
$\phi = a_1 * (a_1 + a_2)$ is read-2, CSOP($\phi$) $= a_1 + a_1 * a_2$, and SOP($\phi$) $= a_1$.
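
The passage from a formula to its CSOP and SOP is mechanical, so a small sketch may help. The encoding below (nested tuples, with strings as variables) and the names `csop`, `sop` and `occurrences` are assumptions made for illustration; they are not notation from the text.

```python
from collections import Counter
from itertools import product

# Assumed encoding: a formula is a variable name (str) or a tuple
# ("+", f1, f2, ...) or ("*", f1, f2, ...).

def csop(f):
    """Complete sum of products: a set of terms, each a frozenset of variables."""
    if isinstance(f, str):
        return {frozenset([f])}
    op, *args = f
    parts = [csop(a) for a in args]
    if op == "+":
        return set().union(*parts)
    # op == "*": distribute; idempotency is automatic because terms are sets
    return {frozenset().union(*combo) for combo in product(*parts)}

def sop(f):
    """Sum of minterms: drop every term that properly contains another term."""
    terms = csop(f)
    return {t for t in terms if not any(u < t for u in terms)}

def occurrences(f):
    """Count how often each variable occurs in the formula."""
    if isinstance(f, str):
        return Counter([f])
    total = Counter()
    for a in f[1:]:
        total += occurrences(a)
    return total

phi = ("*", "a1", ("+", "a1", "a2"))   # a1 * (a1 + a2)
print(csop(phi))          # the terms a1 and a1 * a2
print(sop(phi))           # the single minterm a1
print(occurrences(phi))   # a1 occurs twice, so phi is read-2
```

Representing each term as a set of variables makes idempotency implicit, and absorption reduces to a proper-subset test.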

With every monotone Boolean function $F$ on the variables $a_1, \ldots, a_n$
we associate a simple graph $G_F$ on the vertex set $\{a_1, \ldots, a_n\}$ whose
edges are the unordered pairs $a_i a_j$ such that $a_i$ and $a_j$ occur in the same
term of SOP($F$). Thus each term of SOP($F$) induces a clique in $G_F$.
For example, for $F_1 = a_1 * a_2 * a_3$ and $F_2 = a_1 * a_2 + a_2 * a_3 + a_3 * a_1$, both
$G_{F_1}$ and $G_{F_2}$ are the triangle on $\{a_1, a_2, a_3\}$. In the other direction, with
every simple graph $G$ we associate a formula $\phi(G)$, which is the SOP
formula whose terms are the maximal cliques of $G$. Thus if $G$ is the
triangle on $\{a_1, a_2, a_3\}$, then $\phi(G) = F_1$. A monotone Boolean function
$F$ is said to be normal when SOP($F$) $= \phi(G_F)$. If $G$ is triangle-free,
then $\phi(G)$ is automatically normal. In that case we say that $G$ is read-$k$
if $\phi(G)$ is read-$k$, and a read-$k$ formula for $\phi(G)$ with the smallest
possible $k$ is said to be read-optimal for $G$. This smallest $k$ is called
the readability of $G$.

For example, if $G$ is a complete bipartite graph with edges $a_i b_j$,
$1 \le i \le m$, $1 \le j \le n$, then $\phi(G)$ has the read-1 formula
$(a_1 + \cdots + a_m) * (b_1 + \cdots + b_n)$. It
follows that if the edges of a triangle-free graph $G$ can be covered by
complete bipartite subgraphs in such a way that each vertex belongs
to at most $k$ of them, then $G$ is read-$k$.
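
A minimal sketch of this covering argument follows. It assumes a cover is given as a list of pairs `(A, B)` of vertex lists, one pair per complete bipartite subgraph; this encoding and the function names are hypothetical. The formula is the sum of the read-1 formulas of the subgraphs, so a vertex occurs once for each subgraph containing it.

```python
from collections import Counter

def cover_formula(cover):
    """Render the formula: a sum of (sum of A) * (sum of B) over the bicliques."""
    return " + ".join(
        "(" + " + ".join(A) + ") * (" + " + ".join(B) + ")" for A, B in cover
    )

def cover_minterms(cover):
    """Terms obtained by expanding the formula; each is a pair of distinct vertices."""
    return {frozenset((a, b)) for A, B in cover for a in A for b in B}

def max_occurrences(cover):
    """Largest number of bicliques that any single vertex belongs to."""
    counts = Counter(v for A, B in cover for v in list(A) + list(B))
    return max(counts.values())

# The path a - b - c - d, covered by the star {a, c} | {b} and the edge {c} | {d}.
edges = {frozenset(e) for e in [("a", "b"), ("b", "c"), ("c", "d")]}
cover = [(["a", "c"], ["b"]), (["c"], ["d"])]
print(cover_formula(cover))
print(cover_minterms(cover) == edges, max_occurrences(cover))
```

Because every expanded term is a pair of distinct vertices, no absorption takes place, and the minterms are exactly the covered edges.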

We illustrate these concepts on grid graphs. It is well known (see
for example [?]) that a monotone Boolean function $F$ is read-1 if
and only if $F$ is normal and $G_F$ is a cograph, i.e., $G_F$ does not have
a path on 4 vertices as an induced subgraph. Since grid graphs are
triangle-free but are not cographs (unless the grid is 1 by 1), they are
not read-1. On the other hand, it is easy to cover the edges of a grid
graph $G$ by complete bipartite subgraphs of the form $K_{2,2}$, $K_{1,1}$ and
$K_{1,2}$ in such a way that each vertex belongs to at most two subgraphs.
To do this, color the squares of $G$ black and white as on a chessboard, and
for each black square take its bounding cycle $K_{2,2}$. These $K_{2,2}$ subgraphs
cover all the internal edges of $G$. Then cover the uncovered boundary
edges with $K_{1,1}$ and $K_{1,2}$. This shows that the readability of $G$ is 2.
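
The chessboard cover can be checked mechanically. The sketch below is an illustration under stated assumptions: vertices are grid coordinates `(row, column)`, a square is "black" when the coordinates of its top-left corner have an even sum, and every uncovered boundary edge is taken as its own $K_{1,1}$ (the text also allows $K_{1,2}$, which this sketch does not use).

```python
from collections import Counter

def grid_cover(rows, cols):
    """Return (edges, cover) for the rows x cols grid graph."""
    edges = set()
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows:
                edges.add(frozenset({(r, c), (r + 1, c)}))
            if c + 1 < cols:
                edges.add(frozenset({(r, c), (r, c + 1)}))
    cover, covered = [], set()
    for r in range(rows - 1):
        for c in range(cols - 1):
            if (r + c) % 2 == 0:                    # a "black" square
                A = {(r, c), (r + 1, c + 1)}        # two opposite corners
                B = {(r + 1, c), (r, c + 1)}        # the other two corners
                cover.append((A, B))                # bounding 4-cycle = K_{2,2}
                covered |= {frozenset({a, b}) for a in A for b in B}
    for e in edges - covered:                       # leftover boundary edges
        u, v = tuple(e)
        cover.append(({u}, {v}))                    # each one a K_{1,1}
    return edges, cover

def check(rows, cols):
    """Report whether the cover is exact and the largest load on any vertex."""
    edges, cover = grid_cover(rows, cols)
    produced = {frozenset({a, b}) for A, B in cover for a in A for b in B}
    load = Counter(v for A, B in cover for v in A | B)
    return produced == edges, max(load.values(), default=0)

print(check(4, 5))
```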

Problem 1.1. Is it true that a triangle-free graph $G$ always has a
read-optimal formula obtained by covering the edges of $G$ with complete
bipartite subgraphs?

1.2. Background on readability. We are indebted to G. Turán [?]
for the following background information on readability of monotone
normal Boolean functions. Recall that a monotone quadratic Boolean
function $F$ is normal if and only if $G_F$ is triangle-free.

Proposition 1.2. Almost all $n$-variable monotone quadratic Boolean
functions have readability $\Omega(n/\log n)$.

Proof.

(1) Let $Q_n$ be the number of $n$-variable monotone quadratic Boolean
functions. Since every subgraph of a complete bipartite graph
$K_{n,n}$ is triangle-free, $\log Q_n \ge c_1 n^2$ for some constant $c_1 > 0$.

(2) Every monotone formula is associated with a parse tree, with
variables at the leaves, and $+$ and $*$ internal nodes representing
the Boolean operations in the formula. The size of the formula
is defined as the number of nodes in the parse tree. Let $M_{n,s}$ be
the number of $n$-variable monotone Boolean formulas of size
$s$; we estimate it as follows. The parse tree is an ordered
tree, and there are $\frac{1}{s}\binom{2s-2}{s-1} < 2^{2s}$ ordered trees with $s$ nodes.
The tree has at most $s$ internal nodes and at most $s$ leaves.
Therefore there are at most $2^s$ ways to assign $*$ or $+$ to the
internal nodes, and at most $n^s$ ways to assign the $n$ variables
to the leaves. Multiplying everything together, we deduce that
$M_{n,s} \le 2^{3s} n^s$. Therefore $\sum_{j=0}^{s} M_{n,j} \le \sum_{j=0}^{s} 2^{3j} n^j \le 2^{3s+1} n^s$
for $n \ge 2$, and therefore $\log \sum_{j=0}^{s} M_{n,j} \le c_2 s \log n$ for some
constant $c_2 > 0$.

(3) If $s \le \frac{c_1 n^2}{c_2 \log n} - \epsilon$ for some $\epsilon > 0$, then by (1) and (2) we have

$$\log \sum_{j=0}^{s} M_{n,j} \le c_2 s \log n \le c_1 n^2 - \epsilon c_2 \log n \le \log Q_n - \epsilon c_2 \log n,$$

or equivalently $\frac{1}{Q_n}\sum_{j=0}^{s} M_{n,j} \le 2^{-\epsilon c_2 \log n} \to 0$. Therefore among all
$n$-variable monotone quadratic Boolean functions, the proportion
of those having a formula of size at most $s$ tends to zero. So with
probability tending to 1 an $n$-variable monotone quadratic Boolean
function has formula size at least $\frac{c_1 n^2}{c_2 \log n}$, and therefore
readability $\Omega(n/\log n)$.

□

No such functions are known explicitly, but there are explicit
$n$-variable monotone quadratic Boolean functions with monotone formula
size $\Omega(n \log n)$ and thus readability $\Omega(\log n)$. To explain this, we use
the concept of graph entropy defined by Körner [?]. We adopt its
definition as presented in Newman and Wigderson [8]. The entropy of
a discrete random variable $Z$ is defined as $H(Z) = -\sum_z P(z) \log_2 P(z)$,
and the mutual information of two random variables $X$, $Y$ is defined
as $I(X,Y) = H(X) + H(Y) - H((X,Y))$. Let $A(G)$ be the set of
all maximal stable sets of a graph $G = (V,E)$. Define $Q(G)$ to be
the set of all probability distributions $Q_{XY}$ on $V \times A(G)$ such that
(a) $Q_{XY}(v,I) = 0$ if $v \notin I$, (b) the marginal distribution $Q_X$ of $Q_{XY}$
on $V$ is the uniform distribution on $V$. Then the entropy of $G$ is
defined as $H(G) = \min\{I(X,Y)\}$, where the minimum is taken over
all random variables $X$ and $Y$ that are distributed according to the
marginal distributions $Q_X$ and $Q_Y$ of some distribution $Q_{XY} \in Q(G)$.
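
To make the definition concrete, the following sketch evaluates $I(X,Y)$ for one joint distribution. It assumes a joint distribution is encoded as a dictionary from pairs to probabilities (a hypothetical encoding), and it does not perform the minimization over $Q(G)$; the example is the single-edge graph, where $Q(G)$ contains only one distribution, so no minimization is needed.

```python
from math import log2

def entropy(dist):
    """Shannon entropy (base 2) of a distribution {outcome: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def mutual_information(joint):
    """I(X, Y) = H(X) + H(Y) - H((X, Y)) for a joint distribution {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return entropy(px) + entropy(py) - entropy(joint)

# Single edge {u, v}: the maximal stable sets are {u} and {v}. Conditions (a)
# and (b) force each vertex to be paired with the stable set that contains it.
joint = {("u", frozenset({"u"})): 0.5, ("v", frozenset({"v"})): 0.5}
print(mutual_information(joint))   # entropy of the single-edge graph
```

For this graph $n = 2$ and $\alpha(G) = 1$, so the value agrees with $\log_2(n/\alpha(G)) = 1$.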

Now we use the following three facts. (1) Körner [?] proved that
every $n$-vertex graph $G$ satisfies $H(G) \ge \log_2\left(\frac{n}{\alpha(G)}\right)$, where $\alpha(G)$ is the
maximum size of a stable set of $G$. (2) Newman and Wigderson [8]
proved that if $G$ is an $n$-vertex graph, the monotone Boolean formula
size of $\phi(G)$ is at least $H(G)n$. (3) Using an explicit Ramsey
construction, Alon [?] gave explicit $n$-vertex triangle-free graphs $G_n$ with
$\alpha(G_n) = O(n^{2/3})$. Applying (1)-(3) to $G_n$, we obtain that the monotone
Boolean formula size of $\phi(G_n)$ is $\Omega(n \log n)$.

Since an $n$-vertex bipartite graph $G$ satisfies $\alpha(G) \ge \frac{n}{2}$, it cannot
satisfy $\alpha(G) = O(n^{1-\epsilon})$ for any $\epsilon > 0$. Therefore the argument in the
preceding paragraph cannot use a bipartite graph instead of Alon's $G_n$.

Jukna [?] proved that every $\{C_3, C_4\}$-free graph $G = (V, E)$ has
monotone Boolean formula size at least $|E|/2$ and hence readability
$\Omega(|E|/|V|)$. Such graphs include many explicit bipartite graphs, and
also the point-line incidence graphs of the projective planes, for which
$|E| \sim |V|^{3/2}$. Thus the readability for such graphs can be as high as
$\Omega(|V|^{1/2})$.

1.3. Results. The graph $G(n)$ is the bipartite graph with vertices
$x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ whose edges are the pairs $x_i y_j$ with $i \le j$.
Figure 1 illustrates $G(3)$.

Figure 1. The graph $G(3)$ (vertices $y_1, y_2, y_3$ drawn above $x_1, x_2, x_3$).

The graph $G(n)$ is an example of so-called chain graphs [?], also
known as difference graphs [?]. The most general chain graph is
obtained from $G(n)$ by duplicating vertices, i.e., adding new vertices with
the same neighbors as existing vertices. It has the same readability as
$G(n)$.

Theorem 1.3 (Main Theorem). The readability of $G(n)$ is
$\Omega\left(\sqrt{\log n / \log\log n}\right)$.

Note that although the lower bound in Theorem 1.3 is smaller than
the ones mentioned above, the graph $G(n)$ is bipartite (so is not covered
by the arguments of Alon), has $C_4$'s (so is not covered by the results of
Jukna) and has a very simple and natural structure. In light of this,
Theorem 1.3 is an interesting result.

Since $G(n)$ is distance-hereditary, this theorem answers affirmatively
a question posed in [2].

The following result follows from Theorem 1.3.

Theorem 1.4. For each $k$, the edges of $G(n)$ cannot be covered by
complete bipartite subgraphs in such a way that each vertex belongs to
at most $k$ of them, for sufficiently large $n$.

On the other hand, Theorem 1.3 follows from Theorem 1.4 if
Problem 1.1 has an affirmative answer. In the Appendix we give a
graph-theoretical proof of Theorem 1.4 that does not use Theorem 1.3; it may be
of independent interest, and served as a starting point of our
investigations. We also show there that $G(n)$ is read-$(1 + \lceil \log_2 n \rceil)$.

Golumbic, Mintz and Rotics [2] have shown that if $F$ is normal and
$G_F$ is a partial $k$-tree, then $F$ is read-$2^k$, and thus has bounded
readability independent of the number of vertices of $G_F$. Our main theorem
continues this line of research with a negative result, namely giving a
very simple family of bipartite graphs with unbounded readability.

2. Proof of the Main Theorem

We shall be using Greek letters such as $\phi$ and $\psi$ to denote formulas.
We say that a formula $\psi$ is as good as a formula $\phi$ when they are
logically equivalent and for each variable, the number of its occurrences
in $\psi$ does not exceed the number of its occurrences in $\phi$.

Each formula $\phi$ is associated with a parse tree, denoted by tree($\phi$),
with the occurrences of the variables of $\phi$ at the leaves and the
operations $+$ and $*$ of $\phi$ at the internal nodes. Figure 2 gives an example.

Figure 2. tree($a_1 * (a_2 + a_3 * a_4) * (a_2 + a_5)$).



We can simplify tree($\phi$) by eliminating internal nodes corresponding
to unary $+$ and $*$ operations, i.e., having a single child. Then, using
associativity, we can assume that every path down tree($\phi$) alternates
between $+$ and $*$ nodes; if for example a $+$ node has a $+$ child, remove
the child and make the grandchildren children of the parent. These
operations give a logically equivalent formula and do not change the
number of occurrences of a variable in $\phi$; we always assume they have
been performed already, as in Figure 2.
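
The normalization just described can be sketched as a recursive pass over the parse tree. The nested-tuple encoding and the name `simplify` are assumptions for illustration only: a formula is a variable name or a tuple whose first entry is the operation.

```python
def simplify(f):
    """Remove unary + and * nodes and merge a node into a parent with the same
    operation, so that + and * alternate along every root-to-leaf path."""
    if isinstance(f, str):
        return f
    op, *args = f
    children = []
    for a in map(simplify, args):
        if not isinstance(a, str) and a[0] == op:
            children.extend(a[1:])    # lift the grandchildren to the parent
        else:
            children.append(a)
    if len(children) == 1:
        return children[0]            # unary node: keep only its child
    return (op, *children)

# A deliberately unnormalized form of the formula of Figure 2.
raw = ("*", "a1",
       ("*", ("+", "a2", ("+", ("*", "a3", "a4"))), ("+", "a2", "a5")))
print(simplify(raw))
```

Children are simplified before they are compared with the parent, so a child exposed by removing a unary node is still merged when it carries the parent's operation. No leaf is created or deleted, which is why the number of occurrences of each variable is unchanged.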

We say that a variable $a_i$ is isolated in a formula $\phi$ if $\phi$ is of the form

$\phi = a_i + \psi$.

A subformula of $\phi$ is obtained by taking a node of tree($\phi$), removing
zero or more of its children but leaving at least two children if the node
is internal, then taking the entire subtree rooted at the resulting node.
For example, $a_3$ and $a_1 * (a_2 + a_5)$ are subformulas of the formula of
Figure 2. A subformula $\psi$ of $\phi$ is 2-mult if the root of $\psi$ is a $*$ node
and it has exactly two children in tree($\phi$). For example, $a_3 * a_4$ is a
2-mult subformula of the formula of Figure 2, but $a_1 * (a_2 + a_5)$ is not.
A formula is said to be non-redundant if it does not have a subformula
of the form $\psi = (a_i + \phi_1) * (a_i + \phi_2)$. Since $a_i + \phi_1 * \phi_2$ is as good as
$\psi$, every formula $\phi$ can be converted to a non-redundant formula that
is as good as $\phi$.

A crucial concept in our proof is that of an extension of $G(n)$. A
formula $\phi$ is said to be an extension of $G(n)$ or to extend $G(n)$ when
SOP($\phi$) consists of all the edges of $G(n)$ (i.e., all the terms of the form
$x_i * y_j$ for $1 \le i \le j \le n$), and in addition zero or more terms, each
of which is a product of two or more $x_i$ variables or two or more $y_j$
variables. For example, $\phi = x_1 * (y_1 + y_2 + y_3) + y_3 * (x_2 + x_3) + x_2 * y_2 + x_1 * x_2 * x_3 + y_1 * y_3$
is an extension of $G(3)$, but
$\psi = x_1 * (y_1 + y_2 + y_3) + y_3 * (x_2 + x_3) + x_2 * y_2 + x_2 * y_1 * (x_2 + y_3)$ is not, because
SOP($\psi$) contains the term $x_2 * y_1$, which is neither an edge of $G(3)$ nor
a product of two or more $x_i$ or $y_j$ variables.
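
The two examples above can be checked by computing SOP directly. This is a sketch under assumptions: formulas use a nested-tuple encoding, variables are the strings `x1`, `y1`, and so on, and the names `chain_graph_edges` and `is_extension` are hypothetical.

```python
from itertools import product

def csop(f):
    """Complete sum of products as a set of frozensets of variable names."""
    if isinstance(f, str):
        return {frozenset([f])}
    op, *args = f
    parts = [csop(a) for a in args]
    if op == "+":
        return set().union(*parts)
    return {frozenset().union(*combo) for combo in product(*parts)}

def sop(f):
    """Minterms: terms of CSOP that do not properly contain another term."""
    terms = csop(f)
    return {t for t in terms if not any(u < t for u in terms)}

def chain_graph_edges(n):
    """Edges x_i * y_j of G(n): all pairs with 1 <= i <= j <= n."""
    return {frozenset({f"x{i}", f"y{j}"})
            for i in range(1, n + 1) for j in range(i, n + 1)}

def is_extension(f, n):
    """Mixed minterms must be exactly the edges; pure minterms need >= 2 variables."""
    terms = sop(f)
    mixed = {t for t in terms
             if any(v[0] == "x" for v in t) and any(v[0] == "y" for v in t)}
    return mixed == chain_graph_edges(n) and all(len(t) >= 2 for t in terms - mixed)

common = [("*", "x1", ("+", "y1", "y2", "y3")),
          ("*", "y3", ("+", "x2", "x3")),
          ("*", "x2", "y2")]
phi = ("+", *common, ("*", "x1", "x2", "x3"), ("*", "y1", "y3"))
psi = ("+", *common, ("*", "x2", "y1", ("+", "x2", "y3")))
print(is_extension(phi, 3), is_extension(psi, 3))
```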

Lemma 2.1. Let $\phi$ be a non-redundant extension of $G(m)$. For every
edge $x_i * y_j$ of $G(m)$, $\phi$ has a 2-mult subformula of the form
$(x_i + \phi_1) * (y_j + \phi_2)$.

Proof. Since the term $x_i * y_j$ occurs in SOP($\phi$), $\phi$ has a subformula
of the form $\phi' = (x_i + \phi_1) * (y_j + \phi_2)$ that contributes this term. If
$\phi'$ is 2-mult, we are done. If not, this is due to another subformula
multiplying $\phi'$ at the same level of tree($\phi$); in other words, $\phi$ has a
subformula of the form $\phi' * \psi$, and because $\phi'$ contributes $x_i * y_j$ to
SOP($\phi$), so does $\phi' * \psi$. The formula $\psi$ cannot be a leaf of tree($\phi$),
because such a leaf could only be $x_i$ or $y_j$, and this would contradict the
non-redundancy of $\phi$. Therefore $\psi$ is rooted at a $+$ node or at a $*$ node.
In fact we may assume that $\psi$ is rooted at a $+$ node, for if $\psi$ has the
form $\psi = \psi_1 * \psi_2$, we replace $\psi$ with $\psi_1$, and if $\psi_1$ still is not rooted at
a $+$ node, we continue this process of taking the first factor.

By the non-redundancy of $\phi$, $\psi$ is neither of the form $x_i + \psi_1$ nor of
the form $y_j + \psi_2$, and therefore $\psi$ itself contributes $x_i * y_j$ to SOP($\phi$).

We now repeat the same argument on $\psi$, and obtain that $\psi$ has a
subformula of the form $\psi' = (x_i + \psi_1) * (y_j + \psi_2)$ that contributes the
term $x_i * y_j$ to SOP($\phi$). If $\psi'$ is 2-mult we are done. If not, we notice
that because $\phi'$ is rooted at a $*$ node and $\psi$ is rooted at a $+$ node,
the root of $\psi'$ is a proper descendant of the root of $\psi$. Therefore our
argument eventually terminates in a 2-mult subformula of $\phi$ having the
form $(x_i + \phi'_1) * (y_j + \phi'_2)$. □

We make the notational convention that whenever we write sets of
the form $\{i(1), i(2), \ldots, i(n)\}$ or formulas of the form $x_{i(1)} + x_{i(2)} + \cdots + x_{i(n)}$
or $y_{i(1)} + y_{i(2)} + \cdots + y_{i(n)}$, we have $i(1) < i(2) < \cdots < i(n)$.

Lemma 2.2. For every $n$ there exists $m > n$ such that every
non-redundant read-$k$ extension of $G(m)$ has a subformula of the form

$(x_{i(1)} + x_{i(2)} + \cdots + x_{i(n)} + \phi_1) * (y_{i(1)} + y_{i(2)} + \cdots + y_{i(n)} + \phi_2)$.

Note that by our notational convention, the subgraph of $G(m)$ induced
by $x_{i(1)}, \ldots, x_{i(n)}, y_{i(1)}, \ldots, y_{i(n)}$ is isomorphic to $G(n)$.

Proof. Given $n$, we take $m$ as a large enough number, to be
specified later. Let $\phi$ be a non-redundant read-$k$ extension of $G(m)$. By
Lemma 2.1, for each of the edges $x_1 * y_j$, $1 \le j \le m$, of $G(m)$, $\phi$ has a
2-mult subformula of the form

$\psi = (x_1 + \phi_1) * (y_j + \phi_2)$.

We say that $\psi$ represents the variable $y_j$ with respect to $x_1$. It is possible
that a 2-mult subformula $\psi$ of $\phi$ represents two variables, say $y_{j(1)}$ and
$y_{j(2)}$, with respect to $x_1$, in which case it has the form

$\psi = (x_1 + \phi_1) * (y_{j(1)} + y_{j(2)} + \phi_2)$.

Since $x_1$ occurs at most $k$ times in $\phi$, there must be at least $\lceil m/k \rceil$
variables $y_{i(1)}, \ldots, y_{i(\lceil m/k \rceil)}$ among $y_1, \ldots, y_m$ all represented with respect
to $x_1$ by the same 2-mult subformula of $\phi$. In other words, $\phi$ has a
2-mult subformula of the form

$\psi_1 = (x_1 + \phi_{11}) * (y_{i(1)} + \cdots + y_{i(\lceil m/k \rceil)} + \phi_{12})$.

We now consider the variables $x_{i(1)}, \ldots, x_{i(\lceil m/k \rceil)}$. If at least $n$ of them
occur isolated in $x_1 + \phi_{11}$, we are done, so we assume this is not the
case. Therefore at least $n_1 = \lceil m/k \rceil - n$ of these variables (in fact at
least $n_1 + 1$ of them), call them $x_{j(1)}, \ldots, x_{j(n_1)}$, do not occur isolated
in $x_1 + \phi_{11}$.

We now repeat the argument for the subgraph of $G(m)$ induced by
$x_{j(1)}, \ldots, x_{j(n_1)}, y_{j(1)}, \ldots, y_{j(n_1)}$. Consider the edges $x_{j(1)} * y_{j(l)}$, $1 \le l \le n_1$,
of this subgraph. By Lemma 2.1 and the fact that $x_{j(1)}$ occurs at
most $k$ times in $\phi$, there is a set of $\lceil n_1/k \rceil$ variables among $y_{j(1)}, \ldots, y_{j(n_1)}$,
say $y_{i'(1)}, \ldots, y_{i'(\lceil n_1/k \rceil)}$, all represented with respect to $x_{j(1)}$ by the same
2-mult subformula of $\phi$. In other words, $\phi$ has a 2-mult subformula of
the form

$\psi_2 = (x_{j(1)} + \phi_{21}) * (y_{i'(1)} + \cdots + y_{i'(\lceil n_1/k \rceil)} + \phi_{22})$.

As before, if at least $n$ of the variables $x_{i'(1)}, \ldots, x_{i'(\lceil n_1/k \rceil)}$ occur isolated
in $x_{j(1)} + \phi_{21}$, we are done, so we assume this is not the case. Therefore
at least $n_2 = \lceil n_1/k \rceil - n$ of these variables, call them $x_{j'(1)}, \ldots, x_{j'(n_2)}$, do
not occur isolated in $x_{j(1)} + \phi_{21}$. And so on.

If we are not done within $k$ steps, we obtain 2-mult subformulas of
$\phi$ of the form

$\psi_1 = (x_1 + \phi_{11}) * (y_{i(1)} + \cdots + y_{i(\lceil m/k \rceil)} + \phi_{12})$,

with

$\{j(1), \ldots, j(n_1)\} \subset \{i(1), \ldots, i(\lceil m/k \rceil)\} \subset \{1, \ldots, m\}$,

and the variables $x_{j(1)}, \ldots, x_{j(n_1)}$ do not occur isolated in $x_1 + \phi_{11}$;

$\psi_2 = (x_{j(1)} + \phi_{21}) * (y_{i'(1)} + \cdots + y_{i'(\lceil n_1/k \rceil)} + \phi_{22})$,

with

$\{j'(1), \ldots, j'(n_2)\} \subset \{i'(1), \ldots, i'(\lceil n_1/k \rceil)\} \subset \{j(1), \ldots, j(n_1)\}$,

and the variables $x_{j'(1)}, \ldots, x_{j'(n_2)}$ do not occur isolated in $x_{j(1)} + \phi_{21}$;

$\psi_3 = (x_{j'(1)} + \phi_{31}) * (y_{i''(1)} + \cdots + y_{i''(\lceil n_2/k \rceil)} + \phi_{32})$,

with

$\{j''(1), \ldots, j''(n_3)\} \subset \{i''(1), \ldots, i''(\lceil n_2/k \rceil)\} \subset \{j'(1), \ldots, j'(n_2)\}$,

and the variables $x_{j''(1)}, \ldots, x_{j''(n_3)}$ do not occur isolated in $x_{j'(1)} + \phi_{31}$.
And so on. In the general case we use the notation $i^{(1)}, i^{(2)}, \ldots$ for
$i', i'', \ldots$ and similarly for $j$, and after $k$ steps we obtain

$\psi_k = (x_{j^{(k-2)}(1)} + \phi_{k1}) * (y_{i^{(k-1)}(1)} + \cdots + y_{i^{(k-1)}(\lceil n_{k-1}/k \rceil)} + \phi_{k2})$,

with

$\{j^{(k-1)}(1), \ldots, j^{(k-1)}(n_k)\} \subset \{i^{(k-1)}(1), \ldots, i^{(k-1)}(\lceil n_{k-1}/k \rceil)\} \subset \{j^{(k-2)}(1), \ldots, j^{(k-2)}(n_{k-1})\}$, $n_k = \lceil n_{k-1}/k \rceil - n$,

and $x_{j^{(k-1)}(1)}, \ldots, x_{j^{(k-1)}(n_k)}$ do not occur isolated in $x_{j^{(k-2)}(1)} + \phi_{k1}$.
Each of the variables $y_{i^{(k-1)}(1)}, \ldots, y_{i^{(k-1)}(\lceil n_{k-1}/k \rceil)}$ occurs in all the
subformulas $\psi_1, \ldots, \psi_k$. We show that these $k$ subformulas are distinct,
and therefore each of the above variables already occurs $k$ times in $\phi$.

For example, we assume that $\psi_1 = \psi_2$ and obtain a contradiction
(the argument is the same for $\psi_i = \psi_j$ for $i < j$). Let us denote

$\psi_{1L} = x_1 + \phi_{11}$

$\psi_{1R} = y_{i(1)} + \cdots + y_{i(\lceil m/k \rceil)} + \phi_{12}$

$\psi_{2L} = x_{j(1)} + \phi_{21}$

$\psi_{2R} = y_{i'(1)} + \cdots + y_{i'(\lceil n_1/k \rceil)} + \phi_{22}$

Thus $\psi_1 = \psi_{1L} * \psi_{1R}$ and $\psi_2 = \psi_{2L} * \psi_{2R}$. By the definition of $\psi_2$, the
variable $x_{j(1)}$ does not occur isolated in $\psi_{1L}$, but it does occur isolated
in $\psi_{2L}$. Therefore $\psi_{1L} \ne \psi_{2L}$. Since $\psi_1$ and $\psi_2$ are 2-mult (they can
be factored in only one way into two subformulas, up to order), the
equality $\psi_{1L} * \psi_{1R} = \psi_{2L} * \psi_{2R}$ then implies that $\psi_{1L} = \psi_{2R}$ and $\psi_{1R} = \psi_{2L}$.
From $\psi_{1L} = \psi_{2R}$ it follows that $y_{i'(1)}$ occurs isolated in $\psi_{1L}$, and
since $\{i'(1), \ldots, i'(\lceil n_1/k \rceil)\} \subset \{i(1), \ldots, i(\lceil m/k \rceil)\}$, this variable also occurs
isolated in $\psi_{1R}$. Therefore $\psi_1$ has the form $(y_{i'(1)} + \phi_1) * (y_{i'(1)} + \phi_2)$,
and this contradicts the assumption that $\phi$ is non-redundant. This
contradiction proves $\psi_1 \ne \psi_2$.

We have shown that each of the variables

$y_{i^{(k-1)}(j)}$, $1 \le j \le \lceil n_{k-1}/k \rceil$,

already occurs $k$ times in $\phi$. We now show that each of the variables

$x_{i^{(k-1)}(j)}$, $1 \le j \le \lceil n_{k-1}/k \rceil$,

occurs isolated in $x_{j^{(k-2)}(1)} + \phi_{k1}$. We assume that for some
$1 \le j \le \lceil n_{k-1}/k \rceil$, the variable $x_{i^{(k-1)}(j)}$ does not occur isolated in
$x_{j^{(k-2)}(1)} + \phi_{k1}$, and obtain a contradiction. By construction, this
variable also does not appear isolated in any of $x_1 + \phi_{11}$, $x_{j(1)} + \phi_{21}$, $\ldots$,
$x_{j^{(k-3)}(1)} + \phi_{k-1,1}$. Therefore none of the $k$ occurrences of the variable
$y_{i^{(k-1)}(j)}$ in $\psi_1, \psi_2, \ldots, \psi_k$ contributes the term $x_{i^{(k-1)}(j)} * y_{i^{(k-1)}(j)}$ to
SOP($\phi$). Since there are no other occurrences of $y_{i^{(k-1)}(j)}$ in $\phi$, the edge
$x_{i^{(k-1)}(j)} * y_{i^{(k-1)}(j)}$ of $G(m)$ does not occur in SOP($\phi$), contradicting the
assumption that $\phi$ extends $G(m)$. This contradiction confirms that all
of the variables

$x_{i^{(k-1)}(j)}$, $1 \le j \le \lceil n_{k-1}/k \rceil$,

occur isolated in $x_{j^{(k-2)}(1)} + \phi_{k1}$. We conclude that $\psi_k$ is of the form

$\psi_k = (x_{i^{(k-1)}(1)} + \cdots + x_{i^{(k-1)}(\lceil n_{k-1}/k \rceil)} + \phi') * (y_{i^{(k-1)}(1)} + \cdots + y_{i^{(k-1)}(\lceil n_{k-1}/k \rceil)} + \phi_{k2})$.


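To conclude the proof, we need only choose $m$ so large that $\lceil n_{k-1}/k \rceil \ge n$.

We have

$n_1 \ge \frac{m}{k} - n$

$n_2 \ge \frac{n_1}{k} - n$

$\vdots$

$n_{k-1} \ge \frac{n_{k-2}}{k} - n$.

Therefore

$$n_{k-1} \ge \frac{m}{k^{k-1}} - \frac{n}{k^{k-2}} - \cdots - \frac{n}{k} - n > \frac{m}{k^{k-1}} - n\left(1 + \frac{1}{k} + \frac{1}{k^2} + \cdots\right) = \frac{m}{k^{k-1}} - \frac{nk}{k-1} \ge \frac{m}{k^{k-1}} - 2n.$$

It follows that if $m \ge 2nk^k$, we have $\lceil n_{k-1}/k \rceil \ge n$, as required. □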
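
The choice $m = 2nk^k$ can be sanity-checked numerically. The sketch below follows the worst case of the recurrence, $n_i = \lceil n_{i-1}/k \rceil - n$ with $n_0 = m$, and returns $\lceil n_{k-1}/k \rceil$; the function names are hypothetical, and this is a numerical check for small parameters, not a substitute for the estimate above.

```python
def ceil_div(a, b):
    """Ceiling of a / b for positive integers, using integer arithmetic only."""
    return -(-a // b)

def final_block_size(m, n, k):
    """Worst case of the proof: start from n_0 = m, apply
    n_i = ceil(n_{i-1} / k) - n for i = 1, ..., k - 1,
    and return ceil(n_{k-1} / k)."""
    size = m
    for _ in range(k - 1):
        size = ceil_div(size, k) - n
    return ceil_div(size, k)

for k in range(2, 6):
    for n in range(1, 6):
        m = 2 * n * k ** k
        print(k, n, m, final_block_size(m, n, k) >= n)
```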

Lemma 2.3. For every $n$ there exists $m > n$ such that every
non-redundant read-$k$ extension $\phi$ of $G(m)$ has a subformula of the form

$\phi' = (x_{i(1)} + x_{i(2)} + \cdots + x_{i(n)} + \phi_1) * (y_{i(1)} + y_{i(2)} + \cdots + y_{i(n)} + \phi_2)$

with the following property: Let $\psi$ denote the formula obtained from $\phi$
by substituting a new variable $z$ for $\phi'$. Then SOP($\psi$) does not contain
terms of the form $z * x_{i(j)}$ or $z * y_{i(j)}$ for $1 \le j \le n$.

Proof. We apply Lemma 2.2 for $n + 2$ and conclude that there exists
$m > n + 2$ such that every non-redundant read-$k$ extension $\phi$ of $G(m)$
has a subformula of the form

$\phi' = (x_{i(1)} + \cdots + x_{i(n+2)} + \phi_1) * (y_{i(1)} + \cdots + y_{i(n+2)} + \phi_2)$.

Define new indices $j(1) = i(2)$, $j(2) = i(3)$, $\ldots$, $j(n) = i(n + 1)$, so
that $\phi'$ takes the form

$\phi' = (x_{j(1)} + \cdots + x_{j(n)} + \phi'_1) * (y_{j(1)} + \cdots + y_{j(n)} + \phi'_2)$,

where $\phi'_1 = x_{i(1)} + x_{i(n+2)} + \phi_1$ and $\phi'_2 = y_{i(1)} + y_{i(n+2)} + \phi_2$.

We assume that for some $1 \le s \le n$ the term $z * x_{j(s)}$ occurs in
SOP($\psi$) and obtain a contradiction. Replacing $z$ with $\phi'$ and expanding
$\phi'$, we obtain a term $y_{i(1)} * x_{j(s)}$ in CSOP($\phi$). This term remains in
SOP($\phi$), because the latter does not have terms of the form $y_{i(1)}$ or
$x_{j(s)}$ that could absorb $y_{i(1)} * x_{j(s)}$, since $\phi$ is an extension of $G(m)$.
Again, since $\phi$ is an extension of $G(m)$, we obtain that $y_{i(1)} * x_{j(s)}$ is an
edge of $G(m)$, a contradiction because $j(s) > i(1)$.

Similarly no term of the form $z * y_{j(s)}$ occurs in SOP($\psi$). □

Lemma 2.4. Suppose $G(n)$ has a read-$k$ extension $\phi$ having a
subformula of the form

$\phi' = (x_1 + x_2 + \cdots + x_n + \phi_1) * (y_1 + y_2 + \cdots + y_n + \phi_2)$

with the following property: Let $\phi''$ denote the formula obtained from $\phi$
by substituting a new variable $z$ for $\phi'$. Then SOP($\phi''$) does not contain
terms of the form $z * x_i$ or $z * y_j$.

Then $G(n)$ has a read-$(k-1)$ extension.

Proof. We call a minterm that is a product of both $x$ and $y$ variables
mixed. So by definition, the mixed minterms of an extension of $G(n)$
are precisely the edges of $G(n)$.

Let $\psi$ be the formula obtained from $\phi$ by substituting 1 (i.e., a true
value) for $\phi'$. Since each variable $x_1, \ldots, x_n, y_1, \ldots, y_n$ occurs in $\phi'$, each
variable occurs in $\psi$ less often than in $\phi$. Therefore $\psi$ is read-$(k-1)$.
To complete the proof, we will show that $\psi$ extends $G(n)$.

Assertion 1: The term $z$ does not occur in SOP($\phi''$), for otherwise we
expand $z$ and obtain the term $x_2 * y_1$ in CSOP($\phi$). This term remains
in SOP($\phi$) because $\phi$ extends $G(n)$, but this implies that $G(n)$ has the
edge $x_2 * y_1$, a contradiction.

Assertion 2: No terms of the form $x_i$ or $y_j$ occur in SOP($\psi$). We
assume for example that the term $x_i$ occurs in SOP($\psi$) and obtain a
contradiction. Since $x_i$ is in SOP($\psi$), it follows that the term $x_i$ or the
term $z * x_i$ is in SOP($\phi''$). The hypothesis rules out the latter, so the
former holds. But this implies that $x_i$ is in SOP($\phi$), which contradicts
the assumption that $\phi$ extends $G(n)$.

Assertion 3: All the mixed terms of SOP($\psi$) are quadratic, i.e., of the
form $x_i * y_j$. We suppose that a non-quadratic mixed term $A$ occurs
in SOP($\psi$) and obtain a contradiction. Either $A$ or $z * A$ occurs in
SOP($\phi''$).

The first case is that $A$ occurs in SOP($\phi''$). Since $\phi$ extends $G(n)$,
$A$ does not occur in SOP($\phi$). Therefore $A$ is absorbed by a proper
subterm $B$ occurring in SOP($\phi$). This $B$ does not occur in SOP($\phi''$),
or else it would also absorb $A$ in SOP($\phi''$). It follows that $B$ is obtained
in SOP($\phi$) by multiplying some term of CSOP($\phi'$) with some subterm
$B'$ of $B$. It follows that some subterm of $B'$ occurs in SOP($\psi$). Since $B'$
is a proper subterm of $A$, $A$ does not appear in SOP($\psi$), a contradiction.

The second case is that $z * A$ occurs in SOP($\phi''$). By the forms of
$\phi'$ and $A$ we have $\phi' * A = A$. Therefore we see that after substituting
$\phi'$ for $z$, some subterm $B$ of $A$ occurs in SOP($\phi$). $B$ must be a proper
subterm of $A$ since $\phi$ extends $G(n)$, and thus all mixed terms of SOP($\phi$)
are quadratic. Then either $B$ or $z * B'$ with $B'$ a subterm of $B$ occurs in
SOP($\phi''$), and in both cases a subterm of $B$ occurs in SOP($\psi$). Since $B$
is a proper subterm of $A$, $A$ cannot occur in SOP($\psi$), a contradiction.

Assertion 4: SOP($\phi$) and SOP($\psi$) have the same mixed terms.

Let $A$ be a mixed term occurring in SOP($\phi$). Then $A$ has the form
$x_i * y_j$. The first case is that $A$ occurs in SOP($\phi''$). In this case a
subterm $B$ of $A$ occurs in SOP($\psi$), but $B$ cannot be a proper subterm
of $A$ by Assertion 2, so $A$ occurs in SOP($\psi$). The second case is that
$A$ does not occur in SOP($\phi''$). In that case $A$ appears in SOP($\phi$) as
a result of multiplying $\phi'$ by some other formulas. Thus SOP($\phi''$) has
a term $z * B$ where $B$ is a subterm of $A$. This $B$ cannot be a proper
subterm of $A$ by Assertion 1 and the hypothesis that $z * x_i$ and $z * y_j$ do
not occur in SOP($\phi''$). Therefore $B = A$ and $z * A$ occurs in SOP($\phi''$).
Substituting $z = 1$ we see that a subterm of $A$ occurs in SOP($\psi$), and
this subterm must be $A$ itself by Assertion 2.

Conversely, let $A$ be a mixed term occurring in SOP($\psi$). By
Assertion 3, $A$ must be quadratic, i.e., $A$ has the form $x_i * y_j$. The first case
is that $A$ occurs in SOP($\phi''$). In this case a subterm of $A$ occurs in
SOP($\phi$), and this subterm must be $A$ itself because $\phi$ extends $G(n)$.
The second case is that $A$ does not occur in SOP($\phi''$). In that case the
term $z * A$ occurs in SOP($\phi''$). Substituting $\phi'$ for $z$ we see that the
terms of CSOP($\phi' * A$) occur in CSOP($\phi$). But by the forms of $\phi'$ and
$A$ we have $\phi' * A = A$. Therefore a subterm of $A$ occurs in SOP($\phi$).
Again, by the form of $A$ and the hypothesis that $\phi$ extends $G(n)$, this
subterm is $A$ itself.

We have proven Assertion 4, and therefore, since $\phi$ extends $G(n)$, so
does $\psi$, as required. □

Theorem 2.5. If $G(n)$ has no read-$(k-1)$ extension, then there exists
$m > n$ such that $G(m)$ has no read-$k$ extension.

Proof. Suppose the conclusion of the theorem fails, i.e., for each $m > n$,
$G(m)$ has a read-$k$ extension. Let $m > n$ be the value given by
Lemma 2.3 for $n$. By our supposition $G(m)$ has a read-$k$ extension
$\rho$. We can find a non-redundant formula $\phi$ that is as good as $\rho$. In
particular $\phi$ is read-$k$, and SOP($\phi$) = SOP($\rho$), so that $\phi$ is also an
extension of $G(m)$. By Lemma 2.3, $\phi$ has a subformula of the form

$\phi' = (x_{i(1)} + x_{i(2)} + \cdots + x_{i(n)} + \phi_1) * (y_{i(1)} + y_{i(2)} + \cdots + y_{i(n)} + \phi_2)$

with the following property: Let $\phi''$ denote the formula obtained from
$\phi$ by substituting a new variable $z$ for $\phi'$. Then SOP($\phi''$) does not
contain terms of the form $z * x_{i(j)}$ or $z * y_{i(j)}$ for $1 \le j \le n$.

Let $\psi$ denote the formula obtained from $\phi$ by substituting zero (i.e.,
false) for all variables except $x_{i(1)}, \ldots, x_{i(n)}, y_{i(1)}, \ldots, y_{i(n)}$ and
renumbering $i(1), \ldots, i(n)$ as $1, \ldots, n$. Then $\psi$ is read-$k$. Since $\phi$ extends
$G(m)$, the mixed terms of SOP($\phi$) are precisely the edges of $G(m)$.
Only the edges induced by $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ (in the new
numbering) survive the substitution, and these edges form $G(n)$. No new
non-mixed terms appear as the result of the substitution. Therefore $\psi$
extends $G(n)$.

Let $\psi'$ be obtained from $\phi'$ by the same substitution and renumbering.
Then $\psi'$ is a subformula of $\psi$ of the form

$\psi' = (x_1 + x_2 + \cdots + x_n + \psi_1) * (y_1 + y_2 + \cdots + y_n + \psi_2)$


14 MARTIN CHARLES GOLUMBIC, URI N. PELED, AND UDI ROTICS 

with the following property: Let $\psi''$ denote the formula obtained from
$\psi$ by substituting a new variable $z$ for $\psi'$. Then SOP($\psi''$) does not
contain terms of the form $z * x_j$ or $z * y_j$ for $1 \le j \le n$. Indeed,
suppose $z * x_j$ occurs in SOP($\psi''$). Since it does not occur in SOP($\phi''$),
a proper subterm, i.e., either $z$ or $x_j$, occurs in SOP($\phi''$). It follows
that either a subterm of $x_2 * y_1$ or the term $x_j$ occurs in SOP($\phi$), which
is impossible since $\phi$ extends $G(m)$.

We have shown that $\psi$ and $\psi'$ satisfy the hypothesis of Lemma 2.4,
so by its conclusion $G(n)$ has a read-$(k-1)$ extension, contradicting
the hypothesis of the theorem. □

Corollary 2.6. For each $k$, $G(m)$ has no read-$k$ extension for $m$
sufficiently large.

Proof. By Theorem 2.5 and the fact that $G(2)$ has no read-1
extension, it follows that there exists an $m$ such that $G(m)$ has no read-$k$
extension. If $G(m + 1)$ had a read-$k$ extension, we would obtain from
it a read-$k$ extension of $G(m)$ by substituting zero for $x_{m+1}$ and
$y_{m+1}$. □

Corollary 2.7. For each $k$, $G(m)$ is not read-$k$ for $m$ sufficiently large.

Proof. This follows from Corollary 2.6, since every formula for $G(m)$ is
an extension of $G(m)$. □

To prove our main theorem, we analyze the proofs above to find out
how large they require $m$ to be for a given $k$.

Proof (of Theorem 1.3). It follows from the proofs of Lemma 2.2 through
Corollary 2.6 that if $G(n)$ has no read-$(k-1)$ extension and $m \ge 2nk^k$,
then $G(m)$ has no read-$k$ extension. Since $G(2)$ has no read-1
extension, it follows by induction on $k$ that $G(2^k \cdot 1^1 2^2 \cdots (k-1)^{k-1})$ has no
read-$k$ extension, and therefore it is not read-$k$. Since
$2^k \cdot 1^1 2^2 \cdots (k-1)^{k-1} \le 1^1 2^2 \cdots k^k$, it follows that if $1^1 2^2 \cdots k^k \le n$, then $G(n)$ is
not read-$k$. We use the estimate $\log(1^1 2^2 \cdots k^k) < k^2 \log k$. If we
choose $k$ as large as possible subject to $k^2 \log k \le \log n$, then
$k = \Omega\left(\sqrt{\log n / \log\log n}\right)$ and $1^1 2^2 \cdots k^k < n$. Therefore for
this $k$, $G(n)$ is not read-$k$; in other words, the readability of $G(n)$ is
$\Omega\left(\sqrt{\log n / \log\log n}\right)$. □

3. Appendix

We denote by $r_n$ the smallest $k$ such that the edges of $G(n)$ can be
covered by complete bipartite subgraphs in such a way that no vertex
belongs to more than $k$ subgraphs. Equivalently, $r_n$ is the smallest
number $k$ such that we can give to each vertex of $G(n)$ at most $k$
colors in such a way that $x_i$ and $y_j$ share a color if and only if $i \le j$,
i.e., if and only if $x_i * y_j$ is an edge of $G(n)$. In that case we say that we
have represented $G(n)$ with these colors. The total number of colors
used does not matter, only how many colors each vertex receives. As we
mentioned in the Introduction, $r_n$ is an upper bound for the readability
of $G(n)$.

Proposition 3.1. $r_n \le r_{n+1}$.

Proof. This follows trivially from the fact that $G(n)$ is an induced
subgraph of $G(n+1)$. □

Lemma 3.2. $r_{n+m} \le 1 + r_{\max(n,m)}$.

Proof. Assume without loss of generality that $n \le m$. Consider
$G(n+m)$. The subgraph $G_1$ induced by $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ is $G(n)$,
and the subgraph $G_2$ induced by $x_{n+1}, \ldots, x_{n+m}$ and $y_{n+1}, \ldots, y_{n+m}$
is isomorphic to $G(m)$. Let $k = r_m$. We represent $G_2$ with a set
of colors so that each vertex of $G_2$ receives at most $k$ colors. Since
$r_n \le k$ by Proposition 3.1, we can represent $G_1$ by a set of new colors
so that each vertex of $G_1$ receives at most $k$ colors. Since no color is
common to $G_1$ and $G_2$, we have not represented the non-existing edges
between $y_1, \ldots, y_n$ and $x_{n+1}, \ldots, x_{n+m}$. Finally we give a new color
to the vertices $x_1, \ldots, x_n$ and $y_{n+1}, \ldots, y_{n+m}$ to represent the edges
between $x_1, \ldots, x_n$ and $y_{n+1}, \ldots, y_{n+m}$. This coloring represents $G(n+m)$
and gives at most $k + 1$ colors to each vertex. □

Corollary 3.3. $r_{2^q} \le q + 1$, or equivalently by Proposition 3.1,
$r_n \le 1 + \lceil \log_2 n \rceil$.

Proof. This follows from Lemma 3.2 and $r_1 = 1$. □
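
The recursion behind Lemma 3.2 and Corollary 3.3 can be written out as a divide-and-conquer coloring. The sketch below is an illustration with hypothetical names: it splits the index range in half, colors both halves with fresh colors, and adds one new color joining the $x$ vertices of the lower half to the $y$ vertices of the upper half.

```python
from itertools import count

def color_chain_graph(n):
    """Return {vertex: set of colors} for G(n); vertices are ("x", i) and ("y", i)."""
    fresh = count()
    colors = {(side, i): set() for side in "xy" for i in range(1, n + 1)}

    def rec(lo, hi):                     # color the copy of G on indices lo..hi
        if lo == hi:
            c = next(fresh)              # one color for the single edge x_lo * y_lo
            colors[("x", lo)].add(c)
            colors[("y", lo)].add(c)
            return
        mid = (lo + hi) // 2
        rec(lo, mid)
        rec(mid + 1, hi)
        c = next(fresh)                  # joins x_lo..x_mid with y_(mid+1)..y_hi
        for i in range(lo, mid + 1):
            colors[("x", i)].add(c)
        for j in range(mid + 1, hi + 1):
            colors[("y", j)].add(c)

    rec(1, n)
    return colors

def represents(colors, n):
    """True when x_i and y_j share a color exactly for i <= j."""
    return all(bool(colors[("x", i)] & colors[("y", j)]) == (i <= j)
               for i in range(1, n + 1) for j in range(1, n + 1))

coloring = color_chain_graph(8)
print(represents(coloring, 8), max(len(s) for s in coloring.values()))
```

Each level of the recursion gives a vertex at most one new color, and the recursion depth is $\lceil \log_2 n \rceil$, which matches the bound of Corollary 3.3.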

Lemma 3.4. If $r_n \ge k$, then $r_{(2k+1)n} \ge k + 1$.

Proof. We assume that $r_n \ge k$ but $r_{(2k+1)n} \le k$ and obtain a
contradiction. By Proposition 3.1 we have $k \le r_n \le r_{(2k+1)n} \le k$, and
consequently

$r_n = r_{(2k+1)n} = k$.

Let $G = G((2k + 1)n)$, and consider a coloring representing $G$ with at
most $k$ colors present at each vertex. We divide up $G$ into $2k+1$ induced
subgraphs $G_1, G_2, \ldots, G_{2k+1}$ isomorphic to $G(n)$, $G_i$ being induced by
the vertices $x_{(i-1)n+1}, \ldots, x_{in}$ and $y_{(i-1)n+1}, \ldots, y_{in}$, $1 \le i \le 2k+1$. We
call $\{x_{(i-1)n+1}, \ldots, x_{in}\}$ and $\{y_{(i-1)n+1}, \ldots, y_{in}\}$ the opposite sides of
$G_i$.

The coloring of $G$ also represents $G_i$. This coloring still represents
$G_i$ if at each vertex of $G_i$ we keep only the colors that appear in the
opposite side of $G_i$. If the resulting coloring has fewer than $k$ colors
present at each vertex of $G_i$, then $r_n < k$, a contradiction. Therefore
$G_i$ has a vertex with $k$ colors, all appearing in the opposite side of $G_i$.
We call such a vertex a distinguished vertex of $G_i$.

Assertion 1: It is impossible that $G_i$ has a distinguished vertex $x_p$
and $G_{i+1}$ has a distinguished vertex $y_q$. We suppose such distinguished
vertices exist and obtain a contradiction. The edge $x_p * y_q$ of $G$
necessitates a common color to $x_p$ and $y_q$. Since $x_p$ is distinguished, this
color is present at some vertex $y_r$ of $G_i$, and since $y_q$ is distinguished,
this color is present at some vertex $x_s$ of $G_{i+1}$. This contradicts the
non-existence of the edge $x_s * y_r$, proving Assertion 1.

Assertion 2: It is impossible that $G_i, G_{i+1}, \ldots, G_{i+k}$ all have
distinguished vertices on the same side. Assume for example that $G_j$ has a
distinguished vertex $y_{d(j)}$ for each $i \le j \le i + k$ (the argument is similar
if $G_i, G_{i+1}, \ldots, G_{i+k}$ all have distinguished vertices on the $x$ side). Since
$y_{d(j)}$ is distinguished, all the $k$ colors present at $y_{d(j)}$ appear on the $x$ side
of $G_j$. Therefore they cannot be present at $y_{d(l)}$ for any $i \le l \le j - 1$,
or else a non-existing edge of $G$ would appear. It follows that each
distinguished vertex $y_{d(j)}$ has $k$ colors that are not present at any other
distinguished vertex $y_{d(j')}$, $j' \ne j$. Now consider the vertex $x_{d(i)}$. Since
it is adjacent to the $k$ distinguished vertices $y_{d(i+1)}, \ldots, y_{d(i+k)}$, it has a
common color with each of them. This already gives to $x_{d(i)}$ $k$ distinct
colors that are not present at the distinguished vertex $y_{d(i)}$. Since $x_{d(i)}$
has no other colors, the edge $x_{d(i)} * y_{d(i)}$ is missing, a contradiction.
This proves Assertion 2.

As a consequence of Assertion 1, there exists an index $0 \le L \le 2k + 1$
such that $G_1, \ldots, G_L$ have distinguished vertices only on the $y$ side and
not on the $x$ side, whereas $G_{L+1}, \ldots, G_{2k+1}$ have distinguished vertices
only on the $x$ side and not on the $y$ side. As a consequence of Assertion 2
we have both $L \le k$ and $2k + 1 - L \le k$, a contradiction, which proves
the lemma. □

Since $r_1 = 1$, Lemma 3.4 gives $r_{3 \cdot 1} \ge 2$, $r_{5 \cdot 3 \cdot 1} \ge 3$, and in general
$r_{(2k-1)!!} \ge k$, where $(2k - 1)!! = (2k - 1) \cdot (2k - 3) \cdots 3 \cdot 1$. This proves
Theorem 1.4.
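
As a small arithmetic illustration of this last step, the sizes $(2k-1)!!$ can be generated by iterating Lemma 3.4 from $r_1 = 1$. The function name `thresholds` is hypothetical.

```python
def thresholds(k_max):
    """Return the sizes n_k = (2k - 1)!! for k = 1, ..., k_max, obtained by
    starting from r_1 >= 1 and applying Lemma 3.4 repeatedly."""
    sizes, n = [], 1
    for k in range(1, k_max + 1):
        sizes.append(n)      # at this point r_n >= k
        n *= 2 * k + 1       # Lemma 3.4: r_{(2k+1) n} >= k + 1
    return sizes

print(thresholds(5))
```

These sizes grow much faster than the $2^{k-1}$ vertices per side for which Corollary 3.3 already allows $k$ colors per vertex, so the two bounds on $r_n$ are far apart.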